Evolutionary history of tyrosine-supplementing endosymbionts in pollen-feeding beetles

Abstract Many insects feeding on nutritionally challenging diets like plant sap, leaves, or wood engage in ancient associations with bacterial symbionts that supplement limiting nutrients or produce digestive or detoxifying enzymes. However, the distribution, function, and evolutionary dynamics of microbial symbionts in insects exploiting other plant tissues or relying on a predacious diet remain poorly understood. Here, we investigated the evolutionary history and function of the intracellular gamma-proteobacterial symbiont “Candidatus Dasytiphilus stammeri” in soft-winged flower beetles (Coleoptera, Melyridae, Dasytinae) that transition from saprophagy or carnivory to palynivory (pollen-feeding) between larval and adult stage. Reconstructing the distribution of the symbiont within the Dasytinae phylogeny unraveled not only a long-term coevolution, originating from a single acquisition event with subsequent host–symbiont codiversification, but also several independent symbiont losses. The analysis of 20 different symbiont genomes revealed that their genomes are severely eroded. However, the universally retained shikimate pathway indicates that the core metabolic contribution to their hosts is the provisioning of tyrosine for cuticle sclerotization and melanization. Despite the high degree of similarity in gene content and order across symbiont strains, the capacity to synthesize additional essential amino acids and vitamins and to recycle urea is retained in some but not all symbionts, suggesting ecological differences among host lineages. This report of tyrosine-provisioning symbionts in insects with saprophagous or carnivorous larvae and pollen-feeding adults expands our understanding of tyrosine supplementation as an important symbiont-provided benefit across a broad range of insects with diverse feeding ecologies.


Amino acids and B-vitamin biosynthesis pathways in the symbiont genomes
Differences in symbiont genome lengths were also reflected in their metabolic capabilities (Figure 5); however, some capabilities were universally retained. All Dasytiphilus strains retained the glycolysis and pentose phosphate pathways, both utilizing β-D-fructose-6P as the starting metabolite. However, the citrate cycle was incomplete, as the symbionts only encode genes for the enzymatic steps from 2-oxoglutarate to oxaloacetate via succinate. Furthermore, all Dasytiphilus strains encoded genes for the complete tyrosine and phenylalanine biosynthetic pathways (aroB, aroQ, aroE, aroK/L, aroA, aroC, aroF/G/H, tyrA and aspC). For the final steps in these pathways, the aspartate aminotransferase gene aspC was encoded; its product can take over the enzymatic activity usually performed by the more common aromatic-amino-acid transaminase encoded by tyrB (1-3). However, these final steps to tyrosine and phenylalanine can also be taken over by the host (4-6). Furthermore, Dasytastes-clade and Listrus-clade symbionts encoded pheA in addition to tyrA for the reaction from prephenate to phenylpyruvate. The symbiont of Listrus sp. 07 was missing the aroQ gene; however, this absence is likely an artefact of the low assembly quality of this symbiont's draft genome, which consists of 90 contigs with a coverage below 10.
All Dasytastes-clade symbionts and several Listrus-clade symbionts had incomplete but likely functional diaminopimelate pathways for lysine biosynthesis. All of these symbionts lacked the argD/dapC gene. Additionally, the Listrus-clade symbionts lacked the lysA gene. Furthermore, the dapE gene was absent in the symbionts of Listrus sp. 00 and Listrus sp. 09, and it was pseudogenized in those of Listrus sp. 02, Listrus sp. 04, and Listrus sp. 07. Lastly, the pathway was completely lost in the symbiont of Listrus sp. 01 as well as in Dasytes-clade symbionts, and it was non-functional in the symbiont of Listrus sp. 06 and in Danacea-clade symbionts, where we found heavily pseudogenized gene remnants.
The gene dapC/argD is commonly missing in closely related endosymbionts, e.g. in some Buchnera, Blochmannia and Sodalis endosymbionts. It is therefore possible that this specific catalytic reaction in the lysine biosynthesis pathway is taken over by other enzymes. It was shown in Escherichia coli that phosphoserine aminotransferase (serC) can perform this step (7). Interestingly, the serC gene was present in all analyzed Dasytastes and Listrus symbionts, which otherwise had a nearly complete lysine synthesis pathway, whereas it was missing in Danacea and Dasytes symbionts, which lacked the lysine pathway altogether. This suggests that serC indeed takes over the function of dapC/argD in the lysine pathway of Dasytiphilus. The dapE gene is also known to be lost from several other endosymbionts with reduced genomes, e.g. the Tremblaya symbionts of mealybugs or Sulcia symbionts (8,9). Remnants of the gene were present in various Listrus-clade symbionts but appeared to be pseudogenized. A direct candidate gene that might take over this enzymatic step could not be identified, but it is possible that this function is taken over by another aminotransferase gene. Moreover, for the sometimes missing lysA gene, there are indications that the speA gene product can take over this enzymatic step at lower efficiency (10). However, speA was present in all Dasytastes-clade symbionts but absent in Listrus-clade symbionts, and only the latter were missing lysA, so speA cannot compensate for lysA in the strains that actually lack it. Alternatively, the pathway might be completed by the host. Many insects, e.g. mealybugs and whiteflies, encode lysA orthologs in their genomes (8,11). Another option is that the last step is not necessary because the symbiont only synthesizes lysine precursors as cell-envelope components. A similar scenario was hypothesized by Andersson et al. (12), but it seems unlikely in this case, as the very closely related Dasytastes symbionts still encoded lysA.
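The reasoning above (a pathway step counts as covered if its canonical gene is intact, as possibly bridged if a proposed substitute is present, and as an open gap otherwise) can be written down as a small sketch. This is only an illustration: the function name `classify_steps`, the state labels, and the toy strain are assumptions made for this example and are not data or code from this study; the substitute table merely encodes the candidates discussed in the text (serC for argD/dapC, speA for lysA), which remain hypotheses.

```python
# Illustrative sketch; gene states below are hypothetical, not data from this study.

# Diaminopimelate pathway steps from tetrahydrodipicolinate precursors to lysine.
PATHWAY = ["dapA", "dapB", "dapD", "argD/dapC", "dapE", "dapF", "lysA"]

# Candidate substitutes discussed in the text (hypothesized, not demonstrated).
SUBSTITUTES = {"argD/dapC": ["serC"], "lysA": ["speA"]}


def classify_steps(genes):
    """Sort pathway steps into encoded, bridged, and open gaps.

    genes maps a gene name to "present", "pseudo", or "absent";
    genes that are not listed are treated as absent.
    """
    result = {"encoded": [], "bridged": [], "gaps": []}
    for step in PATHWAY:
        if genes.get(step, "absent") == "present":
            result["encoded"].append(step)
            continue
        # A pseudogenized or absent step may be bridged by an intact substitute.
        backups = [g for g in SUBSTITUTES.get(step, []) if genes.get(g) == "present"]
        if backups:
            result["bridged"].append((step, backups[0]))
        else:
            result["gaps"].append(step)
    return result


# Hypothetical strain: dapE pseudogenized, argD/dapC and lysA absent, serC intact.
toy_strain = {
    "dapA": "present", "dapB": "present", "dapD": "present",
    "dapE": "pseudo", "dapF": "present", "serC": "present",
}
report = classify_steps(toy_strain)
print(report["bridged"])
print(report["gaps"])
```

For this toy strain, the argD/dapC step is reported as bridged by serC, while dapE and lysA remain open gaps that would have to be filled by an unidentified enzyme or by the host.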
Symbionts from the Dasytastes-clade also encoded genes for the biosynthetic pathways of the essential amino acids histidine, methionine, and threonine. Furthermore, Dasytastes-clade symbionts encoded genes for incomplete but probably functional biosynthetic pathways for the B-vitamins riboflavin (B2), pyridoxine (B6), and folate (B9). The enzymatic pathway for riboflavin is branched and requires one molecule of guanosine 5'-triphosphate (GTP) and two molecules of ribulose 5-phosphate. Dasytastes-clade symbionts encode genes for all necessary enzymes, although there is some uncertainty about the enzymatic step that catalyzes the dephosphorylation of 5-amino-6-(5'-phospho-D-ribitylamino)uracil to 5-amino-6-(1-D-ribitylamino)uracil. The enzyme responsible for this step was elusive for a long time, and only recently were several candidates published (13-15), all belonging to the haloacid dehalogenase (HAD) superfamily. The detected candidate genes are quite different from each other, and the similarity between their coding sequences is low (15). Dasytastes-clade symbionts encoded the yigL gene, which was also found in the Serratia endosymbiont of aphids, where it was hypothesized to encode an enzyme that catalyzes the missing enzymatic step (16). Therefore, we speculate that the yigL gene fulfills the same function in Dasytiphilus and that this pathway is fully functional.
Moreover, symbionts of the Dasytastes-clade, with the exception of Dasytes sp. 02, were presumably able to synthesize the vitamin folate (B9). They encoded all but two genes needed for the branched biosynthesis pathway, which requires guanosine 5'-triphosphate (GTP) and chorismate. The missing enzymatic steps would be performed by alkaline phosphatase and dihydrofolate reductase. However, it was shown that the gene encoding alkaline phosphatase is often not annotated reliably, so its apparent absence from a genome annotation cannot necessarily be used to rule out a complete folate biosynthesis pathway (20). Furthermore, this particular function might also be taken over by other multifunctional phosphatases. The other enzyme, dihydrofolate reductase, is usually encoded by the gene folA. The genomes of several strains of Dasytiphilus across all four clades were annotated to carry this gene, including Dasytes sp. 01 and Dasytes sp. 02. Due to the high gene synteny among Dasytiphilus strains, we were able to find the folA gene in all symbionts. Even though several mutations had occurred, protein BLAST identified the putative gene sequence as folA; we therefore hypothesize that this gene is functional in all symbionts.

Other putative symbiont functions
Besides the biosynthetic pathways for metabolites that are potentially provided to the host, further differences between the symbiont strains existed. Symbionts from the Dasytastes-clade encoded genes for urease (ureA, ureB, and ureC), which catalyzes the hydrolysis of urea (21), and for its auxiliary proteins. Of the auxiliary genes (ureD/ureH, ureE, ureF and ureG), the symbionts lacked ureE, which encodes a metallochaperone that binds nickel and can deliver it to the apoprotein (22), where it is important for the maturation of the urease. However, it was shown that ureE is not essential for this process: the apoprotein can obtain nickel and become functional in ureE-deleted mutants, albeit with lower efficiency (22,23). It is therefore conceivable that the Dasytastes-clade symbionts encode a functional urease. The urease-catalyzed hydrolysis of urea provides ammonia that the bacterium could use to synthesize glutamine with the help of glutamine synthetase, encoded by the glnA gene.
Subsequently, an enzyme complex encoded by the carA and carB genes could use the glutamine to synthesize carbamoyl phosphate, a metabolite that functions as a precursor to aspartate and cytidine triphosphate (CTP) synthesis in the Dasytastes-clade symbionts. The latter pathway also yields glutamate. Alternatively, the carA and carB enzyme complex can utilize excess ammonia directly to synthesize carbamoyl phosphate if glutamate, and therefore glutamine, is limited (24). Some differences also existed in the capability to synthesize metabolites important for the cell envelope. Whereas symbiont strains of the Dasytastes-clade could use the substrates glycerone phosphate or phosphatidylethanolamine to produce cardiolipin, all other symbionts retained only the gene for cardiolipin synthase, which carries out the last enzymatic step. Additionally, the symbiont strains in P. viridicoerulea, D. aeratus, D. plumbeus and D. virens lost the pathway to synthesize peptidoglycan.

Figure S2: Bacterial community composition in Melyridae beetles given in relative abundance of bacterial amplicon sequence variants (ASVs) determined at family level by DADA2 analysis of Illumina 16S rRNA gene amplicons. Samples with fewer than 1,000 reads after removal of reads assigned to chloroplasts and mitochondria were excluded. Every bar represents a single individual, with DNA extracted from the whole body. The 2,000 most abundant bacterial ASVs are displayed with annotated family; remaining ASVs are grouped as "Other." NTC = No template extraction controls.
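The processing steps named in this legend (discard organelle reads, exclude shallow samples, convert counts to relative abundance, and pool low-ranking taxa as "Other") can be sketched as follows. This is a minimal illustration under stated assumptions, not the pipeline used in this study, which relied on DADA2: the function name `relative_abundance`, the organelle labels, the input layout (a dictionary of read counts per sample), and the toy counts are all hypothetical.

```python
# Illustrative sketch; labels, layout, and counts are assumptions, not study data.
ORGANELLE_LABELS = {"Chloroplast", "Mitochondria"}  # assumed annotation labels


def relative_abundance(samples, min_reads, top_n):
    """Return per-sample relative abundances after filtering.

    samples maps sample name -> {taxon label: read count}.
    Samples with fewer than min_reads non-organelle reads are dropped;
    taxa outside the top_n most abundant overall are pooled as "Other".
    """
    cleaned = {}
    for name, counts in samples.items():
        kept = {t: n for t, n in counts.items() if t not in ORGANELLE_LABELS}
        if sum(kept.values()) >= min_reads:
            cleaned[name] = kept

    # Rank taxa by their total read count across all retained samples.
    totals = {}
    for counts in cleaned.values():
        for taxon, n in counts.items():
            totals[taxon] = totals.get(taxon, 0) + n
    top = set(sorted(totals, key=totals.get, reverse=True)[:top_n])

    profiles = {}
    for name, counts in cleaned.items():
        depth = sum(counts.values())
        profile = {}
        for taxon, n in counts.items():
            label = taxon if taxon in top else "Other"
            profile[label] = profile.get(label, 0.0) + n / depth
        profiles[name] = profile
    return profiles


# Hypothetical counts: the second sample falls below the threshold once
# chloroplast reads are removed and is therefore excluded.
toy = {
    "beetle_1": {"A": 900, "B": 90, "C": 10, "Chloroplast": 500},
    "beetle_2": {"A": 400, "Chloroplast": 700},
}
profiles = relative_abundance(toy, min_reads=1000, top_n=2)
print(sorted(profiles))
```

Applying the read threshold after organelle removal, as the legend describes, matters here: the second toy sample has 1,100 raw reads but only 400 bacterial reads, so it is excluded.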

Figure S3: Phylogenetic reconstruction of the different Dasytiphilus strains based on a set of 49 COG and inferred using an approximately-maximum-likelihood algorithm. Node labels indicate local support values, with values below 70 removed.

Figure S4: Genome synteny plot, comparing the gene order between the genomes of Dasytiphilus symbionts of hosts in the Dasytes-clade. The gene identity percentage of homologous proteins is based on amino acid sequences and indicated by differential grey values. The phylogeny on the left is based on a set of 49 COG and was reconstructed using an approximately-maximum-likelihood algorithm.

Figure S5: Tissue localization of Dasytiphilus symbionts via fluorescence in situ hybridization in adult Dasytes niger (A), Psilothrix viridicoerulea (B), and Dolichosoma lineare (C). Dasytiphilus (in green or yellow) are aggregated in the bacteriocytes. No general bacteria (in red) are visible. Cell nuclei stained in blue with DAPI.

Figure S6: Sagittal sections of the abdomen of Dasytes plumbeus. Using fluorescence in situ hybridization, Dasytiphilus symbionts were labeled specifically in magenta and non-specifically with a eubacterial probe in yellow. Background autofluorescence is given in white, and a general DNA counterstain in cyan (DAPI). Pictures A, F, and K show the overlaps of all four channels. The symbiont-filled bacteriome (bac) is located towards the posterior of the abdomen (A-E). No bacterial signal was found in the ovaries (F-O) or in eggs that were still in the ovaries (ov). Scale bars = 50 µm.

Figure S7: Dasytiphilus symbionts were localized in the gut of an adult Dasytes plumbeus via fluorescence in situ hybridization of sagittal sections. The different panels show an overlay of all channels (A, F, G), eubacterial staining in yellow (B, H), Dasytiphilus-specific staining in magenta (C, I), control staining in white (D, J), and cell nuclei staining with DAPI in turquoise (E, K). The white square in panel F shows the area depicted at higher magnification in panels G-K, with the channels corresponding to A-E. Scale bars = 50 µm. In between the visible consumed pollen grains, many eubacterial cells can be seen, some of which are also stained by the Dasytiphilus-specific probe. The Dasytiphilus cells in the gut lumen were slimmer and more elongated than the Dasytiphilus cells kept intracellularly within the bacteriome (see Figure 4), potentially as a result of the different environment.

Figure S8: No Dasytiphilus symbionts were found via fluorescence in situ hybridization in adult Eschatocrepis constrictus (A), Gracilivectura pygidialis (B), and Trichochrous pallescens (C). Control staining is in white and cell nuclei stained in turquoise with DAPI. Pictures shown are overlaps of all four channels. Consumed pollen grains were visible in the gut in all species. Scale bars = 100 µm.

Figure S9: Representation of the circular genome of the Dasytiphilus symbiont of Dasytes niger. The inner gray line shows the relative GC content. Colored blocks depict genes separated on inner and outer circle based on direction of transcription, with color indicating the annotated functional KEGG categories.

Figure S10: Correlation of genome length with GC content of Dasytiphilus symbionts. Dot colors indicate phylogenetic clade of host taxon for each symbiont strain. The dotted line represents the trendline.

Figure S11: Absence and presence of individual genes from selected amino acid and B-vitamin pathways. Each arrow stands for an enzymatic step, with the encoding gene given in italics. Green arrows indicate that this gene is present in all analyzed Dasytiphilus strains. Blue arrows indicate that this gene is present in all Dasytiphilus strains from the Dasytastes-clade, but absent in all other strains. Yellow arrows indicate a patchy distribution across Dasytiphilus strains, with explanations given below. Black and white arrows indicate that this gene is missing in all Dasytiphilus strains. Rectangular boxes represent key metabolites. Phenylalanine pathway: gene pheA is only present in all Dasytastes-clade symbionts and all Listrus-clade symbionts, except in the endosymbiont (ES) of Listrus sp. 01. Amino acid pathways starting from aspartate: Genes thrA and asd are only present in all Dasytastes-clade symbionts and all Listrus-clade symbionts, except in the ES of Listrus sp. 01. Genes dapA, dapB, dapD, and dapF are only present

Table S1: Diagnostic and quantitative PCR primers, 16S microbial community amplicon primers and FISH probes used in this study.

Table S2: Regions of the 16S rRNA gene that were amplified during microbial community analysis for each sample.

Table S3: Sequencing systems and services used for metagenome sequencing of individual taxa.

Table S5: Adaptation of Table 1, giving the number of analyzed specimens for each individual analysis. A red number means that the respective analysis showed that the symbiont was absent; green numbers represent symbiont presence. For analyses that used only a single specimen, a female was always used.